Abstract

Context-free attribute grammars are proposed
as derivational models for proofs in the predicate
calculus. The new representation is developed and
its correspondence to resolution-based clause
interconnectivity graphs is established. The new
representation may be used to transform a predicate
calculus characterization of a problem into a
regular algebra characterization of the solutions.

The new representation can be used to simplify
the search for proofs. It allows us to express and
derive predicate calculus proofs as a constraining
function that serves as a filter to the set of
candidate proofs that ignore the arguments to
predicates. The effect of this is to separate the
underlying propositional structure from the
restrictions imposed by the required unifications.

While previous theorem proving methods have
been able to enumerate all proofs of a theorem,
the method reported here is unique in being able
to characterize all proofs of some theorems,
representing even an infinite set of proofs with a
finite formula. This work has implications for
proof theory as well as providing a useful tool in
the analysis of programs specified in logic.

1.  Introduction 

This section gives definitions of clause
interconnectivity graphs and context-free grammars.
The definitions have been extended where needed to
express the additional structure treated in this
paper. A more detailed description of clause
interconnectivity graphs is given by Sickel [4].
Common definitions in theorem proving used here
are given by Chang and Lee [1].

A substitution σ is a set of ordered pairs
[t1/x1, t2/x2, ..., tn/xn] where each ti is an
arbitrary term and each xi is a distinct variable.

For an arbitrary literal L, Lσ denotes the literal
L with all occurrences of xi replaced by ti for
1≤i≤n. Similar definitions apply for terms, tσ,
and clauses, Cσ.
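
The definition of applying a substitution can be made concrete with a small sketch. The term encoding here is an assumption for illustration only: a variable is a Python string, and a constant or function application is a tuple whose first element is the functor. A substitution [t1/x1, ..., tn/xn] is encoded as a dictionary from each xi to ti.

```python
# Hypothetical encoding (not from the paper): variables are strings,
# constants and function applications are tuples (functor, arg1, ..., argn).

def apply_subst(term, subst):
    """Return term with every occurrence of each variable x_i replaced by t_i."""
    if isinstance(term, str):                 # a variable
        return subst.get(term, term)
    functor, *args = term
    return (functor, *(apply_subst(a, subst) for a in args))


def apply_to_literal(literal, subst):
    """A literal is a (sign, atom) pair; the sign is unaffected by substitution."""
    sign, atom = literal
    return (sign, apply_subst(atom, subst))


# [f(y)/x, g(a)/z] applied to the literal P(x, z, y)
sigma = {"x": ("f", "y"), "z": ("g", ("a",))}
literal = (True, ("P", "x", "z", "y"))
print(apply_to_literal(literal, sigma))
```

The replacement is simultaneous: the y introduced by f(y) is not itself rewritten, which matches the set-of-pairs reading of the definition.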

A directed substitution is given by θ_{s1 s2},
where θ is a substitution [t1/x1, ..., tn/xn], and
s1 and s2 form a partitioning of the variables of
θ. For example, [f(y)/x, g(a)/z], subscripted with a
two-part partition of its variables x, y, z, is a
directed substitution.

A variant σ_{i,j} of a directed substitution
σ = θ_{s1 s2} is a substitution derived from σ in which
each variable y ∈ s1 is replaced by y_i in θ, and
each variable z ∈ s2 is replaced by z_j in θ.
Variant σ_{i,_} replaces only variables from s1;
variant σ_{_,j} replaces only variables from s2.
σ_{_,_} = σ.

Given a set of clauses that are variable
disjoint, we can construct a clause interconnectivity
graph (CIG). A CIG is a quadruple:

<Nodes, Edges, Subst, Clause> where
Nodes is a set of graph nodes, one for each
literal in each clause. Even if two literals
in separate clauses are identical, they
correspond to different nodes.

Edges is a symmetric relation between pairs
of nodes such that <A,B> ∈ Edges iff the
literals associated with nodes A and B have
opposite signs and unifiable atoms. We write
A<>B if <A,B> ∈ Edges, and A<+>B if either
A<>B or if ∃C,D such that A<>C, C ∈ Clause(D)
but C≠D, and D<+>B.

Subst is a mapping: Edges → Directed
Substitutions such that Subst(<A,B>) = θ_{s1 s2} where θ
is a most general unifier of the atoms of the
literals associated with nodes A and B, s1 is
the set of variables appearing in both A and
θ, and s2 is the set of variables appearing
in both B and θ. In the ground case, Subst
maps to the empty substitution.

Clause is a mapping: Nodes → Powerset(Nodes).
Clause partitions the nodes so that literals
in the same clause have corresponding nodes
in the same partition.

The start clauses are one or more clause
partitions which may be chosen arbitrarily.
Usually they are the clauses representing the
negation of the theorem to be proved.

Residual_literals is a mapping:

Nodes → Powerset(Nodes) where
Residual_literals(B) = Clause(B) - {B}.
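
A minimal sketch of the ground part of this structure follows, assuming nodes are identified by plain string names and ignoring Subst entirely. The class name `CIG`, its methods, and the three-clause example are hypothetical and chosen only for illustration. The method `reaches` follows the recursive definition of A<+>B given above.

```python
from dataclasses import dataclass, field


@dataclass
class CIG:
    clauses: list                               # one list of node names per clause
    edges: set = field(default_factory=set)     # symmetric set of (A, B) pairs

    def add_edge(self, a, b):
        # Edges is symmetric, so both directions are stored.
        self.edges.update({(a, b), (b, a)})

    def clause(self, node):
        return next(set(c) for c in self.clauses if node in c)

    def residual_literals(self, node):
        return self.clause(node) - {node}

    def reaches(self, a, b, seen=None):
        """A<+>B: traverse an edge, then alternate residual selection and edges."""
        seen = set() if seen is None else seen
        if a in seen:
            return False
        seen.add(a)
        for x, c in self.edges:
            if x != a:
                continue
            if c == b:
                return True
            if any(self.reaches(d, b, seen) for d in self.residual_literals(c)):
                return True
        return False


# Hypothetical ground example: clauses {A,B}, {C,D}, {G}; edges A<>C and B<>G.
cig = CIG(clauses=[["A", "B"], ["C", "D"], ["G"]])
cig.add_edge("A", "C")
cig.add_edge("B", "G")
print(cig.residual_literals("C"), cig.reaches("C", "G"))
```

Storing each edge in both directions keeps the symmetry of Edges explicit, at the cost of a linear scan in `reaches`; an adjacency map would be the natural refinement for larger graphs.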

For example, Figure 1 shows a simple CIG.
Throughout the discussion of this example we refer
to nodes by the literals they represent. If two
or more literals were identical, it would be
necessary to distinguish between them. For this example:

Nodes = the set of circled literals

Edges = {<A,B> | A and B are connected by an
edge of the graph}, i.e., {<A(x), ¬A(y)>,
<¬A(y), A(x)>, etc.}

A(x)<+>¬C(g(u)) since A(x)<>¬A(y), ¬A(y) ∈
Clause(C(g(y))) and C(g(y))<>¬C(g(u)).

Intuitively P<+>Q means that from P we can
get to Q by first traversing an edge and then
alternating between selecting a residual of
the destination and from there traversing an
edge, etc.

Subst maps each edge to a directed
substitution, e.g., Subst(<A(x), ¬A(y)>) = [x/y]_{(x)(y)},
Subst(<¬A(y), A(x)>) = [x/y]_{(y)(x)}, etc.
Undirected substitutions are shown on the
undirected edges of the figure.

Clause maps each node to the set of nodes in
the same clause, e.g.,
Clause(¬A(f(z))) = {¬A(f(z)), ¬C(g(u)), D(w)}.

Residual_literals maps a node to the other
nodes in the same clause, e.g.,
Residual_literals(¬A(f(z))) = {¬C(g(u)), D(w)}.


A CIG  for  clauses  {A(x)  B(f(x)), 


Continuing with the definitions,

Unifying composition (⊙) is a mapping:
Substitutions × Substitutions → Substitutions
where σ⊙θ = γ such that γ is a most general
unifier satisfying, for an arbitrary term t,

(tσ)γ = (tγ)σ = tγ = (tθ)γ = (tγ)θ.

If no such γ exists, σ⊙θ is undefined. ⊙ is
commutative and associative, i.e.,
σ⊙θ = θ⊙σ
σ⊙(θ⊙γ) = (σ⊙θ)⊙γ

Composition of substitutions is normally
defined in the context of applying substitutions
sequentially. However, for this application
we are looking for evidence that substitutions
are compatible. Here, substitutions may be
applied in any order; therefore they must be
commutative and associative.
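
One way to read unifying composition operationally is as solving all the bindings of both substitutions simultaneously by ordinary unification. The sketch below takes that reading as an assumption; it is not claimed to be the procedure used in the paper. Terms use a hypothetical encoding (variables as strings, applications as tuples headed by the functor), and a substitution [t/x] is written as the dictionary {x: t}. The function `compose` returns None when the composition is undefined.

```python
def walk(term, s):
    """Follow variable bindings in s until an unbound variable or a compound term."""
    while isinstance(term, str) and term in s:
        term = s[term]
    return term


def occurs(var, term, s):
    term = walk(term, s)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, arg, s) for arg in term[1:])


def unify(a, b, s):
    """Extend the binding set s so that a and b become equal; None on failure."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


def resolve(term, s):
    """Apply the bindings in s exhaustively to term."""
    term = walk(term, s)
    if isinstance(term, str):
        return term
    return (term[0], *(resolve(arg, s) for arg in term[1:]))


def compose(sigma, theta):
    """Unifying composition: a most general solution of all bindings, or None."""
    s = {}
    for subst in (sigma, theta):
        for var, term in subst.items():
            s = unify(var, term, s)
            if s is None:
                return None
    return {var: resolve(var, s) for var in s}


# [x/y] composed with [y/u] and then with [f(z)/x]
step = compose({"y": "x"}, {"u": "y"})
print(compose(step, {"x": ("f", "z")}))
```

Because the result depends only on the set of equations being solved, the order of the arguments does not matter up to renaming of variables, which is the commutativity and associativity the definition requires.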


Suppose we have a sequence m of edges ei = <Ai,Bi>,
1≤i≤n, such that A1 = Bn and for 1≤i<n,
A(i+1) ∈ Residual_literals(Bi), and for i≠j,
Ai ∉ Clause(Aj). Then the sequence leads from
node A1 back to itself (i.e., A1<+>A1) without
visiting any other clause more than once. The
base node is A1. If the substitutions along
the way allow us to use the same instance of
A1 at both ends of the sequence (i.e., if
σ = Subst(e1)⊙Subst(e2)⊙...⊙Subst(en) is
defined) then m is called a merge loop.

Sub(m), the substitution of the loop, is the
directed substitution σ_{s1 s2} such that s1 is
the list of variables from the clause of base node
A1 and s2 is the list of all other variables
appearing in the loop.

For a merge loop <A1,B1>, <A2,B2>, ..., <An,Bn>
the undeleted residuals = {L | L ∈
Residual_literals(Bk) - {A(k+1)} where 1≤k≤n-1}.

In order to find proofs, we must find ways to
delete all literals in a start clause. If a CIG
is a tree to begin with, then it models a proof in
which the root is the start clause. However, if
the CIG is not a tree, then it must be made into
one. For example.


The definitions for deletion tree and solution tree
do that transformation, along with checking for
substitution consistency.

A deletion tree for node N, denoted T(N), is:

a) a finite tree

(i) having root N

(ii) where N has a single child L such that
<N,L> ∈ Edges

(iii) and L has k children T(L1), ..., T(Lk)
where Residual_literals(L) =
{L1,...,Lk}

(iv) and Sub(T(N)) = Subst(<N,L>)_{_,j} ⊙
Sub_j(T(L1)) ⊙ ... ⊙ Sub_j(T(Lk)) is
defined, where j is an integer not used
before†, and where Sub_i(T(N)) =
Sub(T(N)) with all previously
unsubscripted variables now subscripted
with i. The subscripts are used to
distinguish between different
occurrences of the variables.

Note: If k=0, then L is a leaf.

b) a finite tree

(i) having root N

(ii) where N has a single child m such that
m is a merge loop with base node N

(iii) and m has k children T(L1), ..., T(Lk)
where {L1,...,Lk} are the undeleted
residuals of m

(iv) and Sub(T(N)) = Sub(m)_{_,j} ⊙ Sub_j(T(L1))
⊙ ... ⊙ Sub_j(T(Lk)) is defined where j
is an integer that has not been used
before and where Sub_i(T(N)) is as in
case a.

Note: If k=0, then m is a leaf.

c) derived only by means of a) and b).

† The reason for this condition is to guarantee
that subtrees T(L1)...T(Lk) do not have common
renaming constants. The choice of j is by a
global function similar to GENSYM in LISP.

A solution tree, Ts(C), for a CIG C is a tree
such that

(i) it has a root

(ii) and has as children deletion trees
for all the literals of a start clause
{L1,...,Lk}

(iii) and Sub(Ts(C)) = Sub_j(T(L1)) ⊙ ... ⊙
Sub_j(T(Lk)) is defined where j is an
integer that has not been used before.

Again, consider the example in Figure 1. The
sequence <A(x), ¬A(y)>, <C(g(y)), ¬C(g(u))>,
<¬A(f(z)), A(x)> is a merge loop. The base node is
A(x). The substitution of the loop is

σ = [f(z)/x, f(z)/y, f(z)/u]_{(x)(u,y,z)}

and D(w) is the only undeleted residual. For the
start clause {A(x), B(f(x))}, Figure 2 shows the
solution tree for this example.


Figure 2: The only solution tree for
Figure 1 with start clause {A(x), B(f(x))}.
Nonempty substitutions of the subtrees
are denoted at the roots.

A context-free grammar is a quadruple:
<Nonterminals, Terminals, Productions, Start symbol>
in which

1) Nonterminals, Terminals and Productions are
finite sets

2) Nonterminals ∩ Terminals = ∅

3) ∀p ∈ Productions, p is of the form:

N → s1 s2 ... sk for any finite k where
N ∈ Nonterminals and, for 1≤i≤k,
si ∈ Nonterminals ∪ Terminals.

N is known as the left-hand side (l.h.s.)
and s1...sk as the right-hand side (r.h.s.)
of the production p.

4) Start symbol ∈ Nonterminals.

E.g., <N={S,A}, T={0,1,2}, P, S> where P is:

{S → 0 A 2
 S → 0 2
 A → 1 A
 A → 1}

If we always use upper case Latin letters to
denote nonterminals and never to denote terminals,
and restrict the start symbol to be "S", then the
production set will fully specify the grammar. We
sometimes use this shortcut notation.

For any set of characters C, the set of all
strings of those characters is denoted C*. E.g.,
for the above grammar, T* equals all strings made
up only of symbols 0, 1, 2. The empty string, a
string consisting of no symbols, is denoted ε.
ε ∈ T* for all T.

If α and β are strings of symbols and there
exists a production A → B1 ... Bn and by replacing
a single occurrence of symbol A in α by the string
B1 ... Bn we get β, then we say α ⇒ β. In the
preceding grammar, 0A2 ⇒ 01A2. If α and β are
strings such that for finite n≥1,
α ⇒ β1 ⇒ β2 ⇒ ... ⇒ βn = β, then we say β can be
derived from α and denote it α ⇒* β. A special case of
this definition is that α ⇒* α. S ⇒* 0111112 in
the previous grammar.

The language L(G) of a context-free grammar
G = <N,T,P,S> is the set of all terminal strings
derivable from S, i.e., {x | x ∈ T* and S ⇒* x}.

A tree is a derivation tree [2] for G if:

1) Every node has a label which is in N ∪ T.

2) The label of the root is S.

3) If a node s has at least one descendant,
and s has label A, then A must be in N.

4) If nodes s1, s2,...,sk are the direct
descendants of node s, in order from the left,
with labels A1,...,Ak respectively, then
A → A1 A2...Ak must be a production in P.

Denote by D(A) a derivation tree for a string
derivable from nonterminal A.

2.  CIG's to Grammars

Grammars provide concise representations for
very rich sets of objects. For example, all
computable functions are grammatically describable.

The functions that we are interested in are the
ones that take statements of theorems as input and
produce proofs of those theorems as output. This
paper attacks that problem by transforming the
input into a grammar that will generate exactly the
set of proofs to the theorem.

Ground case. In the ground case, all substitutions
are empty and are therefore all mutually compatible.
Ignoring substitutions simplifies the problem so we
will start here. From a ground case CIG:

C = <Nodes, Edges, Subst, Clause>,
construct a context-free grammar G. Let

G = <Nodes ∪ {S}, Edges, P, S>
where S ∉ Nodes ∪ Edges. P consists of:

a) S → L1...Lk for each starting clause
{L1,...,Lk}.

b) B → e C1...Ck for each edge e=<B,C> and
Residual_literals(C)={C1,...,Ck}.

c) B → m L1...Lk for each merge loop m with base
node B and where L1...Lk are the undeleted
residuals of m.

For example, consider the ground case CIG of
Figure 3. The edges of Figure 3 are named a-d and the
nodes A-G for purposes of exposition. Let the
start clause be {A,B} and let e' denote edge <Y,X>
if e denotes edge <X,Y>. Edges are directed to
designate in which direction the label applies.


Then G is:

<{S,A,B,C,D,E,F,G}, {a,b,c,d}, P, S> where P is:

S → A B                      C → a' B
A → a b c (a merge loop)     D → b E
A → a D                      E → c B
A → c' F                     F → b' C
B → d                        G → d' A

L(G) = {abcdd, c'b'a'dd, abcd}.
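
The ground-case construction can be sketched in a few lines. Since Figure 3 is not reproduced here, the clause partitions and edge endpoints below are an assumption inferred from the productions listed above ({A,B}, {C,D}, {E,F}, {G}, with a=<A,C>, b=<D,F>, c=<E,A>, d=<B,G>). The merge loop is supplied by hand rather than detected, and all function names are hypothetical.

```python
# Assumed reconstruction of the Figure 3 CIG (inferred from the listed productions).
clauses = [["A", "B"], ["C", "D"], ["E", "F"], ["G"]]
edges = {"a": ("A", "C"), "b": ("D", "F"), "c": ("E", "A"), "d": ("B", "G")}
merge_loops = [("A", ["a", "b", "c"], [])]   # (base node, edge labels, undeleted residuals)
start_clause = ["A", "B"]


def residual_literals(node):
    clause = next(c for c in clauses if node in c)
    return [n for n in clause if n != node]


def build_productions():
    """Rules a), b), c) of the ground-case construction."""
    prods = {"S": [list(start_clause)]}                                   # rule a
    for name, (x, y) in edges.items():                                    # rule b
        prods.setdefault(x, []).append([name] + residual_literals(y))        # e  = <X,Y>
        prods.setdefault(y, []).append([name + "'"] + residual_literals(x))  # e' = <Y,X>
    for base, labels, undeleted in merge_loops:                           # rule c
        prods.setdefault(base, []).append(labels + undeleted)
    return prods


def language(prods, max_len=10):
    """Terminal strings derivable from S, expanding the leftmost nonterminal."""
    results, stack = set(), [["S"]]
    while stack:
        form = stack.pop()
        if len(form) > max_len:               # bound keeps the search finite
            continue
        idx = next((i for i, sym in enumerate(form) if sym in prods), None)
        if idx is None:
            results.add(" ".join(form))
            continue
        for rhs in prods[form[idx]]:
            stack.append(form[:idx] + rhs + form[idx + 1:])
    return results


print(sorted(language(build_productions())))
```

Under these assumptions the enumeration yields the same three strings as the L(G) given above; the production G → d' A is generated but never reached from S.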


A derivation tree for abcdd is shown in Figure 4.

Figure 4: Trees for abcdd


The string abcdd corresponds to the proof in
which the sequence of resolutions corresponding to
edges a, b, c, d, d are performed as shown in
Figure 4(ii). A-G are merely labels, not the literals
themselves. Because edges represent complementary
pairs, A = ¬C = ¬E, D = ¬F, B = ¬G; dashed lines
connect these pairs in the proof for the convenience
of the reader. Each edge traversed is denoted at
the corresponding resolution step, also for
convenience.


Thm 1. For the ground case, the set of deletion
trees for node N of CIG C is equivalent to the set
of derivation trees for strings derivable from
nonterminal N using the productions of the grammar G
derived from C.


Proof. The proof is by induction on the depth of the
trees. Both cases contain a basis step as an
instance. There are two operations for constructing
each of the deletion trees and derivation trees.
We shall show that the two pairs correspond. The
first production for G does not apply since we have
no way of generating S from any other nonterminal.

Part a. Construction rule a for deletion trees
corresponds to production rule b of grammars
constructed from CIG's. Figure 5 shows the
corresponding constructions.

Figure 5: A corresponding pair of construction
rules for deletion and derivation trees:
(i) a construction rule for deletion trees;
(ii) a construction rule for derivation trees.

For k=0, it is clear that L, as a child of N, can
be mapped onto <N,L> and vice versa. For k>0, we
assume that if T(Li) is equivalent to D(Li) for
1≤i≤k then Figure 5(i) is equivalent to Figure
5(ii). The only transformations to be made are
between L and <N,L> and a simple tree reshaping.

Part b. Construction rule b for deletion trees
corresponds to production rule c of grammars
constructed from CIG's. Figure 6 shows the
corresponding constructions.

Figure 6: A corresponding pair of construction
rules for deletion and derivation trees:
(i) a construction rule for deletion trees;
(ii) a construction rule for derivation trees.

For k=0 the two trees are identical. For k>0, we
assume that if T(Li) is equivalent to D(Li) for
1≤i≤k then Figure 6(i) is equivalent to Figure
6(ii) with the only required transformation being
a simple tree reshaping.

From  parts  a and  b we  conclude  that  for 
any  deletion  tree  we  can  construct  a corresponding 
derivation  tree  and  vice  versa.  Q.E.D. 

Thm 2. For the ground case, the set of solution
trees of a CIG is equivalent to the set of
derivation trees for the language of the grammar G
constructed from the CIG.

Proof. Figure 7 shows the form of solution trees
and derivation trees for L(G). For each Li, 1≤i≤k,
we can construct equivalent T(Li) and D(Li). It is
therefore trivial to show that for any solution
tree we can construct an equivalent derivation tree
from S.

Figure 7: Forms of solution trees and derivation
trees for L(G): (i) the form of a solution tree,
with subtrees T(L1) ... T(Lk); (ii) the form of a
derivation tree for an element of L(G), with
subtrees D(L1) ... D(Lk).

Thm 3. Every proof by resolution of the
unsatisfiability of a set S of ground clauses can be mapped
onto an element in the language L(G) where G is
the grammar constructed from the CIG C
constructed from S.

Proof. Any proof by resolution of the
unsatisfiability of S can be mapped onto a solution tree
for C [5]. By Thm 2 we know that every solution
tree for C can be mapped onto a derivation tree of
an element of L(G). Every derivation tree can be
mapped onto the element of L(G) that consists of
the leaves of the tree in the same left-to-right
order.

Thm 4 (Soundness). Suppose G is the grammar
derived from CIG C, derived in turn from the set of
clauses S. If L(G) is non-empty, then S is
unsatisfiable and any element of L(G) can be mapped onto
a proof of the unsatisfiability of S.

Proof. Suppose L(G) contains some string s. Every
member of L(G) has a derivation tree that describes
the process of deriving s. Let d be a derivation
tree for s. From Thm 2 we can map d onto a
solution tree of C. That solution tree maps onto a
proof by resolution of the unsatisfiability of S [5].

Thm 5 (Completeness). If a set S of clauses is
unsatisfiable then the language L(G) is nonempty
and any element of L(G) can be used to construct
a proof of the unsatisfiability of S.

Proof. Assume S unsatisfiable. Then there exists a
refutation r of S by resolution [3]. By Thm 3 we
know that r can be mapped onto an element s of
L(G) and L(G) must therefore be nonempty.
Furthermore, by Thm 4, s can be used to find a proof of
the unsatisfiability of S.

Note:  The  terminals  making  up  the  string  s 
are  edge  names.  Those  edge  names  may  be  used  to 
represent  resolution  steps  that  collectively  will 
generate  the  empty  clause. 

General case. In deriving a string in a
context-free language, any production may be applied
whenever the current, derived string contains the
nonterminal that is the l.h.s. of that production.
We shall now define a similar type of grammar in
which application of productions is further
restricted.

A context-free attribute grammar is a
context-free grammar in which the productions are replaced
by production-attribute pairs: (P,A) where P is a
production and A is a predicate. When applying a
production the corresponding attribute must be
true.


E.g., G = <{S}, {0,1}, P_A, S> where P_A consists
of the two elements:

      P             A
1)  S → 0 S 1     derived string is of length
                  less than or equal to 6
2)  S → 0 1       True

Then L(G) = {01, 0011, 000111, 00001111}
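
The example can be checked by direct enumeration. The sketch below assumes that each attribute is evaluated on the current derived string before its production is applied, which is the reading under which the four strings above are obtained. The function name `derive` and the list encoding of sentential forms are illustrative choices, not notation from the paper.

```python
def derive():
    """Enumerate L(G) for the two production-attribute pairs of the example."""
    productions = [
        (["0", "S", "1"], lambda form: len(form) <= 6),   # 1) S -> 0 S 1, length attribute
        (["0", "1"], lambda form: True),                  # 2) S -> 0 1, always applicable
    ]
    results, stack = set(), [["S"]]
    while stack:
        form = stack.pop()
        if "S" not in form:
            results.add("".join(form))        # a terminal string
            continue
        i = form.index("S")
        for rhs, attribute in productions:
            if attribute(form):               # attribute tested on the current string
                stack.append(form[:i] + rhs + form[i + 1:])
    return results


print(sorted(derive(), key=len))
```

The search terminates because the only recursive production is blocked once the derived string grows past six symbols.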

The method for constructing a grammar in the
general case is basically the same as that in the
ground case except that we add attributes to the
productions. Each attribute corresponds to the
substitution that must be made at that step and its
compatibility with the substitutions that have
already been made in the derivation.

Assume a given CIG. Find all merge loops.

Now construct a context-free attribute grammar,
<Nonterminals, Terminals, P_A, S> where:

S is a new symbol, i.e., S ∉ Nodes ∪ Edges
Nonterminals = {S} ∪ {N_i | N ∈ Nodes, i>0}
Terminals = Edges

P_A = {(P,A)} where

1) P is S → L1_j ... Lk_j for each start clause
{L1,...,Lk}; A is true. The empty
substitution is the substitution of the newly
derived string, or

2) P is B_i → e C1_j...Ck_j where e = <B,C> ∈
Edges, Residual_literals(C) = {C1,...,Ck}.
A is true iff β' = β ⊙ α_{i,j} is defined where
β denotes the accumulated substitution of
the current, derived string and α = Subst(e).
β' is the substitution of the newly derived
string, or

3) P is B_i → m L1_j...Lk_j where m is a merge
loop, {L1,...,Lk} are the undeleted literals
of m. A is as in case 2 except that
α = Sub(m).

The indexing of the nonterminals is used to
keep track of different copies of the same variable.
If more than one instance of a clause is used in
the proof then the variables in those clauses must
be distinguishable. We assume that variables in
different clauses already have different names, so
that the only possible ambiguity arises from
multiple instances of a given clause.

Production-attribute pairs are really
templates.† If nonterminal B_i appears in a derived
string, then we may use the template

B_i → e C1_j...Ck_j

to create

B_i → e C1_n ... Ck_n

where n is an integer constant not previously used
as a nonterminal subscript. This production may
now be applied to the derived string s if the
attribute of the production is true, i.e., if
Substitution(s) ⊙ α_{i,n} is defined.

† The number of production templates is finite, but
the number of actual productions obtainable will be
infinite. It would have been possible, however, to
limit ourselves to a finite set of productions and
nonterminals by complicating the attributes.

For example, consider the CIG C of Figure 8.
Again, if e is edge <X,Y> let e' denote <Y,X>; if
Subst(e) = σ_{s1 s2} then Subst(e') = σ_{s2 s1}. The
grammar constructed from C is G = <{S,A,B,C,D,E,F,
G,H,I,J}, {1,2,3,4,5,1',2',3',4',5'}, P_A, S> where
P_A pairs each production P with the attribute
A: true if β' = β⊙α is defined, where α is the
substitution of the edge used. The substitutions
attached to the edges of Figure 8 include [f(x)/y],
[f(a)/t], and [b/w, a/v].

Figure 8: A CIG with variable
lists attached to clause
partitions

L(G) equals the single element set {23451}.

The productions in our constructed attribute
grammar tell us how to generate strings, their
corresponding derivation trees and associated
substitutions, top-down. In order to prove the
completeness theorem for the general case, we need to
define the bottom-up construction of substitutions
of derivation trees. The following is such a
definition. Proof of equivalence of this to the
original definition is simple and left to the reader.

For the derivation tree shown in Fig. 9(i),
Sub(D(B_i)) = Subst(e)_{i,j} ⊙ Sub(D(C1_j)) ⊙ ... ⊙
Sub(D(Ck_j)).

For the derivation tree shown in Fig. 9(ii),
Sub(D(B_i)) = Sub(m)_{i,j} ⊙ Sub(D(L1_j)) ⊙ ... ⊙
Sub(D(Lk_j)).

For the derivation tree shown in Fig. 9(iii),
Sub(D(S)) = Sub(D(L1_j)) ⊙ ... ⊙ Sub(D(Lk_j)).

Figure 9: All possible configurations
of derivation trees: (i) root B_i with children
e, D(C1_j) ... D(Ck_j); (ii) root B_i with children
m, D(L1_j) ... D(Lk_j); (iii) root S with children
D(L1_j) ... D(Lk_j).

Thm 6. (General case. Completeness and Soundness)
The set of solution trees of a CIG is equivalent
to the set of derivation trees for the language
of the attribute grammar constructed from the CIG.

Proof. Consider first the ground structure, i.e.,
the CIG without substitutions and the grammar with
the attributes ignored. In this case, productions
enjoy unrestricted use. Looking only at the
ground structure, we are reduced to the situation
of Thms 1 and 2. Therefore the ground structures
of the two are equivalent. The general case puts
restrictions on those ground structures. What we
need to show is that the restrictions allow the
equivalent structures to be admitted at the
general level.

For the same case breakdown as before, we
shall show that the structures the two systems
admit are equivalent and have equivalent
substitutions. The proof will be by induction on the
depth of the trees. The basic steps are the
special instances of cases 1 and 2 in which k=0.


Case 1. In the deletion tree, N has the single child
L with subtrees T(L1) ... T(Lk); in the derivation
tree, N_i has children <N,L>, D(L1_j) ... D(Lk_j).

Assume by the induction hypothesis that T(L1), ...,
T(Lk) and D(L1), ..., D(Lk) are admitted by their
respective systems, and that Sub(T(Lh)) =
Sub(D(Lh_j)) with a possible change of variables
for 1≤h≤k. Then let

α = Sub(T(N)) =
Subst(<N,L>)_{_,j} ⊙ Sub_j(T(L1)) ⊙ ... ⊙ Sub_j(T(Lk))

β = Sub(D(N_i)) =
Subst(<N,L>)_{i,j} ⊙ Sub(D(L1_j)) ⊙ ... ⊙ Sub(D(Lk_j)).

For 1≤h≤k, all variables in Sub(D(Lh_j)) are
subscripted since every substitution applied
under the grammar is fully subscripted; furthermore,
all variables of Lh_j are subscripted by j. All
variables of Sub(T(Lh)) are subscripted except
for those in Lh, since by the definition of
deletion tree, before forming Sub(T(Lh)) we subscript
all unsubscripted variables. Sub_j(T(Lh)) then
equals Sub(T(Lh)) with all variables of Lh now
subscripted. Therefore, if by the induction
hypothesis, Sub(D(Lh_j)) equals Sub(T(Lh)) up to a change,
cv, of variables, then Sub(D(Lh_j)) equals
Sub_j(T(Lh)) up to cv', which is equal to cv minus
the changes involving unsubscripted variables.

Therefore, Sub(D(N_i)) and Sub(T(N)) are either
both or neither defined, and Sub(D(N_i)) = Sub(T(N))
with cv' plus change in Sub(T(N)) the unsubscripted
variables x of N to x_i.


Case 2. In the deletion tree, N has the single child
m with subtrees T(L1) ... T(Lk); in the derivation
tree, N_i has children m, D(L1_j) ... D(Lk_j).

α = Sub(T(N)) =
Sub(m)_{_,j} ⊙ Sub_j(T(L1)) ⊙ ... ⊙ Sub_j(T(Lk)) and

β = Sub(D(N_i)) =
Sub(m)_{i,j} ⊙ Sub(D(L1_j)) ⊙ ... ⊙ Sub(D(Lk_j)).

Sub(D(Lh_j)) = Sub_j(T(Lh)) with change cv' for
1≤h≤k as in Case 1. Therefore Sub(D(N_i)) and
Sub(T(N)) are either both or neither defined, and
Sub(D(N_i)) = Sub(T(N)) with cv' plus change in
Sub(T(N)) the unsubscripted variables x of m to x_i.


Case 3. The solution tree has subtrees
T(L1) ... T(Lk); the derivation tree from S has
subtrees D(L1_j) ... D(Lk_j).

α = Sub(Ts(C)) = Sub_j(T(L1)) ⊙ ... ⊙ Sub_j(T(Lk))
β = Sub(D(S)) = Sub(D(L1_j)) ⊙ ... ⊙ Sub(D(Lk_j))

and Sub(D(Lh_j)) = Sub_j(T(Lh)) with change cv' for
1≤h≤k as in Case 1. Therefore Sub(D(S)) and
Sub(Ts(C)) are either both or neither defined, and
Sub(D(S)) = Sub(Ts(C)) with change cv'.


3.  Grammar G to Language L(G)


In the theorem proving application just
described, a context-free grammar is defined whose
language represents proofs of a theorem. A
description of that language is a description of the
set of objects we desire. We describe a concise,
closed form and discuss how to derive it.

In the case of regular grammars (a special
case of context-free grammars), we can describe
the generatable language as a regular expression.
Regular expressions are defined as follows:

a) Every terminal symbol is a regular
expression.

b) If A and B are regular expressions, then
(A), A|B, AB, and A* are also.

c) No sequence of symbols not satisfying a)
or b) above is a regular expression.

Parentheses are used to enclose subexpressions,
"A|B" is used to denote the choice of A or B, "AB"
to denote concatenation of regular expressions
A and B, and "A*" denotes that A is repeated zero
or more times.

E.g., for grammar G:

S → 0 S     A → 1 B
S → 1 A     B → 2
A → 1 A     B → 3

L(G) is represented by the regular expression
0* 1 1 1* (2|3), i.e., any element of L(G) consists
of zero or more "0"s followed by two or more
"1"s followed by either a "2" or a "3".

Context-free grammars sometimes generate
languages that are not representable by regular
expressions. For example, grammar:

S → ( S )
S → e

generates language: (^n e )^n, n≥0, i.e., all strings
that consist of zero or more open parens followed
by "e" followed by the same number of close parens
as open parens. There is no way to express
(^n e )^n strictly as a regular expression since the
only repetition operator, *, does not carry a
count that can be matched. E.g., (* e )* would
allow "(((e)". In order to eliminate this
difficulty, we add to the regular expression notation
the positive integer exponent. We define a
regular algebra R that admits the following
expressions:

a) Every terminal symbol is in R.

b) If A and B are in R, then (A), A|B, AB,
A* and A^n are in R.

c) No sequence not satisfying a) or b) is in R.

The addition of exponents is essential since the
inherent power of context-free grammars allows
balanced bracketing of expressions. Some
context-free grammars naturally create expressions in R.
Several simple operations can be used to generate
the expression in those cases.

1) Back-substitution. If for any nonterminal
A all productions having A as the l.h.s. have no
nonterminals on the r.h.s., then replace all
references to A by the alternative terminal strings
or expressions in R that A derives. Then remove
all productions having A as the l.h.s. E.g.,

A → a
A → b* c      ⇒   B → a (a | b* c) c
B → a A c

2) Simple recursion. If for any nonterminal
A the only productions having A as the l.h.s. are
of the form:

(type 1) A → ti, 1≤i≤n or

(type 2) A → li A, 1≤i≤k or

(type 3) A → A ri, 1≤i≤j

where ti, li, and ri represent expressions in R,
where n≥1, k,j ≥ 0, then

A = (l1|...|lk)* (t1|...|tn) (r1|...|rj)*.

Replace every reference to A in other productions
by this expression representing strings derivable
from A. Eliminate all productions having A as the
l.h.s. E.g.,

A → A 0
A → b 1* A
A → 2 A
A → a         ⇒   S → 0 (b 1* | 2)* (a|b) 0*
A → b
S → 0 A

3) Internal recursion. If for any nonterminal A,
the only productions having A as the l.h.s. are of
the form:

(type 1) A → ti, 1≤i≤n
(type 2) A → l A r

where ti, l and r represent expressions in R, where
n≥1, and there is a single production of type 2,
then A = l^n (t1|...|tn) r^n.

Replace A as before and eliminate the above
productions. E.g.,

A → 1* 2 (a|b)
A → 0 A 1     ⇒   B → b 0^n (1* 2 (a|b)) 1^n 1
B → b A 1
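
The regular expression 0* 1 1 1* (2|3) given earlier in this section for the grammar S → 0 S, S → 1 A, A → 1 A, A → 1 B, B → 2, B → 3 can be cross-checked by brute force. The following sketch uses Python's standard `re` module; the function names and the length bound are illustrative assumptions. It enumerates the terminal strings of the grammar up to a bound and compares them with the strings over {0,1,2,3} of the same lengths that the expression matches.

```python
import re
from itertools import product

# The regular grammar of the example; nonterminals are the upper case letters.
PRODUCTIONS = {"S": ["0S", "1A"], "A": ["1A", "1B"], "B": ["2", "3"]}
PATTERN = re.compile(r"0*111*(2|3)")


def grammar_strings(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    out, stack = set(), ["S"]
    while stack:
        form = stack.pop()
        if len(form) > max_len:          # no production shortens a form
            continue
        nt = next((ch for ch in form if ch in PRODUCTIONS), None)
        if nt is None:
            out.add(form)
            continue
        for rhs in PRODUCTIONS[nt]:
            stack.append(form.replace(nt, rhs, 1))
    return out


def regex_strings(max_len):
    """All strings over {0,1,2,3} of length <= max_len matched in full by PATTERN."""
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product("0123", repeat=n)
        if PATTERN.fullmatch("".join(chars))
    }


print(grammar_strings(6) == regex_strings(6))
```

Agreement up to a bound is evidence rather than proof, but it is a cheap guard against slips when the replacement rules are applied by hand.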

The above three replacement rules will not
suffice for all context-free grammars. For example
the grammar G:

S → 0 S 1     S → S 3
S → 2 S       S → a

has language (0|2)* a (1|3)* such that the number
of 0's equals the number of 1's. We cannot,
strictly speaking, represent that by our regular
algebra. However we can loosen the ordering
restriction on the r.h.s. of productions since the
r.h.s.'s represent subgoals to be solved and the
order is unimportant. This allows us to represent
the language as (01|2)* a 3*. A later paper will
discuss this ordering relaxation and double
recursion, e.g., A → 0 A 1 A.

Loosening the ordering restriction also allows
the elimination of some redundancy. The first
grammar derived from a CIG that appears in this
paper generates the following language: {abcdd,
c'b'a'dd, abcd, cbad}. Allowing reordering and
representing edges <X,Y> and <Y,X> similarly, then
the set reduces to {abcdd, abcd}. For further
discussion of minimizing the set representing the
proof schemata, see [7].

4.  Conclusions 


We have described a method that represents
proofs in predicate logic by a formal language.
This language includes the full set of proofs for
a given theorem. If we can describe the language
with a regular algebra then we have a closed form
for a possibly infinite set of proofs. This
representation gives an analysis of the flow of the
derivation. This analysis can be used on programs
specified in logic to describe required execution
flow of the program that leads to termination with
the proper result.

The regular algebra representation can also
be used to analyze the values that variables may
have if the derivation is to terminate. In the
logic program application, this analysis will be
used to clarify the possible values of input and
output variables, i.e., the domain and range of the
function computed by the program.

Studies are underway to apply this regular
algebra representation of logic specifications.
Some of the areas presently being considered are:
1) program synthesis
2) plan formation
3) question-answering
4) machine learning.